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Abstract 

We provide some theoretical extensions and a calibration protocol for our former dynamic optimal 
execution model. The Hawkes parameters and the propagator are estimated independently on financial 
data from stocks of the CAC40. Interestingly, the propagator exhibits a smoothly decaying form with 
one or two dominant time scales, but only so after a few seconds that the market needs to adjust after 
a large trade. Motivated by our estimation results, we derive the optimal execution strategy for a multi¬ 
exponential Hawkes kernel and backtest it on the data for round trips. We find that the strategy is 
profitable on average when trading at the midprice, which is in accordance with violated martingale 
conditions. However, in most cases, these profits vanish when we take bid-ask costs into account. 
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1 Introduction 


In the last fifteen years, the literature in quantitative finance has been enriched by many studies on optimal 
execution problems. The principle is as follows: one considers a particular trader who wants to liquidate a 
quantity xq of assets on the time interval [0,T]. Thus, if Xt is the position at time t, one has Xq = xq and 
Xt+ = 0: Xq > Q (resp. xq < 0) corresponds to to a sell (resp. buy) program. The trader uses an execution 
strategy of minimal expected cost, which should take into account the fact that large trades have an impact 
on the market price. The works of Bertsimas and Lo [H] and Almgren and Chriss [S] are pioneers in this area. 
They have been followed by several authors who suggested extensions of their framework, such as Obizhaeva 
and Wang [23] who considered a model that includes transient price impact. This feature allows to reproduce 
the mean-reversion that is observed in intra-day prices. On average, when a large trade impacts the market 
price, a fraction of this impact vanishes over time. 


In Alfonsi and Blanc [T], we introduce a model where other liquidity takers trade the same asset as the large 
trader, and share the same price impact profile as her. In this model, the volumes of incoming trades is 
described by a cadlag (right continuous left limits) pure jump process Nt, and the market price Pt at time t 
is given by 


Pt = Y. X 

T<t 


V 

Q 
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( 1 ) 
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where the times t are the jump times of the process N and AN^. = Nj. — Nt_ is the signed volume of the 
order at time r. Thus, 9 > 0 is a measure of market liquidity, v G [0,1] the proportion of permanent impact, 
and p > 0 the resilience speed of the transient part of the price. In [T], the order flow is modeled by a 
two-dimensional Hawkes process, which allows self and mutual excitation between buy and sell orders. An 
interesting feature of this model is that it accounts for herding behavior and meta-orders splitting, see Bacry 
and Muzy |7]. Namely, let N~^ and N~ be two nondecreasing cadlag pure jump processes that describe 
respectively the volumes of incoming buy and sell orders. We have N = N~^ — N~, and we proposed in [T] 
the following model for the respective jump intensities of 


Kj. — ^00 “b ^ ^ 

T<t 

— ^cxD “b ^ ^ 

T<t 
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+ l{AAr^>0}‘b’c 


AW 

mi 

AW 

mi 


^-P{t-T) 


^-P(t-T) 
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where Koo > 0 is the common baseline intensity of A+ and N~, /3 is the resilience speed of the intensity and 
(psjific ■ R'*' —t R'*' are measurable positive functions that encode intensity feedback. We assume that the 
sizes of orders are independent variables distributed according to a square integrable probability law p, on 
R+, and mi = xp(dx) is the average amplitude of the jumps of N. This price model is called MIH, as 
Mixed-Impact Hawkes. In this model, we provide a closed-form solution for the optimal liquidation strategy, 
and determine a set of conditions on v, p, (i,(ps,(pc that exclude Price Manipulation Strategies (as defined 
in pn]) from the model. These are referred to as the MIHM (Mixed Impact Hawkes Martingale) conditions. 

One of the benefits of the framework introduced in [T] is that it is possible to calibrate the model on financial 
data, without effectively trading (which can be costly). One only has to observe the order flow and price 
process of the market, and to estimate the price impact of trades issued by other participants, which is 
expected to be similar to the impact that the liquidating trader would have. The aim of the present paper 
is to conduct such a calibration on real stock data. This enables us to evaluate the realism of the theoretical 
price model of [T], as well as the performance of the optimal strategy in a practical context. Since our main 
goal is to confront the model to market data, we test the validity of our calibration protocol on simulations 
and we leave its mathematical justification for further research. 

Many studies have explored the estimation of Hawkes parameters in various contexts (see for instance Bacry 
et al. [S], Bouchaud and Hardiman HU, Reynaud-Bouret [5^, Lemonnier and Vayatis |13]). The present 
paper focuses on marked Hawkes processes used to model price jumps triggered by transactions in financial 
markets, where the marks of the jumps are either the traded volumes or the price jumps. As opposed to most 
Hawkes models in finance, price moves which do not correspond to trades are treated separately through 
the propagator function. Propagator price models have been studied extensively in theoretical frameworks 
such as Gatheral [15], Alfonsi et al. |4] and Gatheral et al. [16], Bouchaud et al. [9] and Farmer et al. |14| . 
However, to the best of our knowledge, very few empirical studies have described the form of the propagator 
curve, or only asymptotically. Here, we suggest an estimation protocol for the propagator and discuss the 
quality of ht of exponential and multi-exponential decays. We also describe the behavior of the curve on the 
first seconds, where it is found to have an increasing part. 

The paper is structured as follows. First, we present the model in Section |5J It extends the one considered 
in [T] to general decay kernels, while preserving most of its properties. Then, in Section |3| we describe our 
dataset and our calibration method. In particular, we explain how we slightly modify the original model to 
be in accordance with practical considerations. Section |T| validates our calibration procedure with simulations 
and discusses the calibration results on real stock data. Eventually, we test in Section [5] the relevance of the 
optimal execution strategy described in Section [2] and discuss whether it may constitute Price Manipulation 
Strategies, i.e. round trips that are profitable in average. 
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2 Model settings 


In view of its estimation to market data, we make the model of |T] more general by adding further parameters. 
First, even if it is appealing to see the price as the pure result of past trades, equation m is probably too 
restrictive and one should add some noise. Besides, we know that adding a martingale to the price process 
does not change the main results on the model, see Remark 2.6 in [T]. Second, we chose the resilience on 
the price and on the intensity to be exponential, and one may like to consider a priori more general decay 
functions. Thus, we consider the following propagator model for the price: 

Pt = -yANrG{t-T) + aWt. (4) 

^ T<t 

The process IT is a Brownian motion independent of N that takes into account the non-deterministic noise 
in limit orders and cancellations. The parameter cr > 0 tunes the volatility of this noise. The function 
G : R+ I—>■ R is the propagator function of the market, that encodes the average evolution of the price 
between two market orders, which takes form through limit orders and cancellations. As before, g > 0 
describes the market liquidity and allows to normalize G such that 0(0) = 1. The propagator model is the 
same as the one considered by Alfonsi et al [4] and Gatheral et al. m and generalizes CD- Similar models 
have been considered for instance by Bouchaud et al. [H] and Gatheral ng. In the same way, we consider a 
general decay function K : R+ i—>■ R+ for the intensities of and N~. Namely, we assume that the jump 
intensities of N'^ and N~ are respectively given by 

- ^OO H” ^ ^ 

r<t 

tvf. — f^OO H” ^ ^ 

T<t 

with i^(0) = 1. We also introduce the average self-excitation ts and the average cross-excitation tc 

poo poo 

is = f >ps{v/mi)^x{dv) and ic = / q}c{v/mi)ijL{Av). 

Jo Jo 

Therefore, the model presented in [Tj corresponds to the exponential decay functions G{t) = e~f’* and 
K{t) = By estimating more general functions G{t) and K{t) in the sequel, we are able to assess the 

relevance of the exponential decay assumption. 
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2.1 Markovian specification of the model 

Considering general decay kernels is very natural from a modeling point of view. Unfortunately, it generally 
leads to drop the Markov property of the price process, which is important in the context of optimal execution. 
Still, for completely monotone decay kernels, it is possible to get back Markovian dynamics for the price. This 
has already been studied in Alfonsi and Schied |3] for the price propagator model. Considering completely 
monotone kernels amounts to assume the existence of probability measures A (dp) and r(;(dp) on such that 

G{t) = v + {1 — ly) [ e“'’*A(dp), K(t) = f e~'^*w{dp). (6) 

Here, for the sake of simplicity, we consider probability measures with finite support. We can then assume 
without loss of generalitjl^ that 

p p 

G{u) = 1 / + ^ Ai exp(-piu), K{u) = '^Wi exp(-piu), (7) 

^Note that G and K may still include different decay speeds: one only has to include all the speeds in the pj’s and to set 
some weights Xi,Wi to zero if necessary. 
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with 0 < pi < • • • < Pp, V, Xi,Wi > 0 such that u + -^1 = 1 ~ ^ ■ • ■ >^'}’ 

introduce the following processes 

dDl = -p, Dldt + - diVt, (8) 

q 

dK^^*^ = -p* - Koo/p) dt + w^[^ps{dN;^/mi) + ip^dN^/mi)], (9) 

dK^^*^ = -pi - Koo/p) dt + Wi[(pc{dNj/'/mi) + (ps{dN^/mi)]. (10) 

We also define the process dSt = ^ dNt that describes the permanent impact component of the price. Then, 
it is easy to check from 0, m and ([S]) that 

Pt = St + Y^ D] + aWt, Kf = ^ 4^^, (11) 

and the process {P, S, D'^, ) satisfies the Markov property. 

Remark 2.1. In the general setting m and we implicitly assume that the stationarity conditions 
(ts + tc) /q K{s)ds < 1, G integrable are satisfied, so that the sums are well-defined. This is no longer 

±(d 

required in the Markovian case since the law of {Pt, St, D\, nf ;t > 0) is determined by the initial con- 

dition {Pq, So, Dq, Kq ). In the particular case Dq = 0 for all i, and only in this case, we have Pt = 

Po + I Y)o<T<t ^NrG(t - r) + aWt. Thus, if ]Do][G{t) - G'(oo)] < Pt for all i e {1, • • • ,p} and all t > to, 

then the approximation Pt ~ Pq + “ So<T<t ^Nj-Git — r) + crWf is reasonable for t > to- 

Besides the Markov property, the particular form ([71) enables us to calculate explicitly the auto-covariance 
function of the number of jumps as explained by Hawkes in m, Section 3. This auto-covariance structure is 
of empirical interest, and serves as a starting point for our calibration procedure, see Section 13.41 The total 
intensity T,t = 4 -\- Kf has the dynamics 

St = 2 koo + >- f Kit - s) dJs, 

J —OO 

where l = Ls + Lc is the average jump size of St, and dJt = [(ps + Pc)(dA^t^/”^i) + (Ps -f(pc)(dAft Ml)]/'' 
has jumps normalized to unity. We assume that the stationarity condition l J4 Kis)ds < 1 holds, see 
Theorem 1 in [TO], and that the intensity process (k/", Kt ) in its stationary state. We consider the symmetric 
auto-covariance function of the infinitesimal increments of J. It is defined for t > 0 by 

^{t) = lim ^E[iJt+h - Jt)iJt-T-eh - Jt-r)] - = dm ^E[Et [Jt-r-eh - Jt-r)] - , (12) 

h-^o+ m h^o+ h 

where k = Koo/( 1 ~ '-//?) is the common stationary mean of k~^ and k~. As derived in [TO], one gets the 
self-consistent equation on for r > 0, 

(t) = TKlK (t) -\-i f Kir — u)'^iu)du. (13) 

J —OO 

Proposition 2.1. Let us assume that K satisfies 0 with wi,... ,Wp > 0 and the stationarity condition 
^ < 1. Then, the autocovariance function is given by 

p 

‘^(t) ='^ajexpi-bj]T]). T gR*. (14) 

1=1 

The coefficients ai, ■ ■ ■ ,ap and bi, ■ ■ ■ ,bp are positive and determined as follows: bi <■■■< bp are the distinct 
roots of the polynomial functions PiX) = X4i^iipi — X) — LY/Ji=iWiY\j^^4k ~ K) and {aibi,--- ,apbp)^ = 
K B~^ (!,••• , 1)^, where B is the Cauchy matrix Bij = -4rz■ 

’ Pi ^7 
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Proof. Equation m then yields for r > 0 


exp(- 6 jT) =2lJ2^ 

j=i i=i 

Therefore, (US holds if we have 
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The first equation gives precisely P{bj) = 0. Since T’(O) > 0 from the stationarity condition and P{pi) = 
—i-wi Wk^iiPk — Pi) has the same sign as (—1)^ we have by the intermediate value theorem that 0 < 6 i < 
Pi < 62 < P 2 < • • • < Pp-i < bp < Pp. These coefficients are distincts and therefore the Cauchy matrix B is 
invertible. Let v = B~^ (1, • ■ • , 1)^: Vi is the Ph row sum of B~^. By Theorem 2 in Vi = —A{bf)/B'{b^), 
where A(x) = n*(a; - pf), B{x) = n*( X — b^). This gives in particular Ui > 0 and thus at > 0. Last, it is 
easy to check (iMll is the unique function satisfying (IT^ . □ 


In the mono-exponential case p = 1, Proposition 12.11 gives l = p — b, ab = {p + b){p — b)K, which yields 

^ —- X 12 ^ exp(-(p - i)|T|), 

2 p (1 - CP)^ 

as found by Hawkes in m- 


2.2 Trading strategies and a generalized no-arbitrage condition 

We now specify the trading rules in our model. We denote by (Pt) the natural filtration generated by the 
process {P, S, D\ ). As in [1], we consider a particular trader called “strategic trader” and denote by Xt 

the number of assets she holds at time t. We assume that the strategy X is (J^t)"^dapted, caglad, square 
integrable and with bounded variations. The caglad (left continuous - right limits) assumption means that 
the strategic trader is able to react instantly to the flow of trades. For simplicity and tractability, we assume 
that the trades of the strategic trader affect the price in the same fashion as other trades, but leave unchanged 
the flow of orders N. To be more precise, we now assume that 

dSt = -{dNt + dXt), dDl = -p, Dl dt + ^{dNt + dXt), 

q q 

-u (0 ("i) I 

but the intensities and Kf remain as defined by ([1]) and (1101) . The price as well as the intensities 
and Kf of buy and sell orders are still defined by (ED- Last, the cost of the trade AXt = Xt+ — Xt at time 
t is assumed to be given by 

= PtAXt + 

This yields the following cost for a liquidation strategy X on [0, T] (i.e. such that Xt+ = 0) 

C(A) = [ p^dXu + ^ V (AA,)2 - PtXt + A|, (15) 

where Vx is the (countable) set of discontinuities of X. 

When considering high-frequency trading, a standard approach is to define arbitrages as strategies that can 
make money on average, with no specific exogenous signal. Roughly speaking, one may expect that by 
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repeating such strategies one obtains a classical almost sure arbitrage. Thus, Huberman and Stanzl have 
proposed in 1201 the following definition of a Price Manipulation Strateqy: this is a strategy X such that 
Xo = Xt+ = 0 and E[C7(X)] < 0. 

Theorem 2.1. The model excludes Price Manipulation Strategies if, and only if Pt is a {Tt)-i^O'f'tingale 
when Xt = 0 for any t. In this case, the optimal strategy is the one given hy Theorem 2 (see also Section 
1.3) in Alfonsi and Schied 

Besides, under the specification dini) and CD of the order flow N = N'^ — N , the model does not admit 
PMS if, and only if, 

\/iG {l,...,p}, {is - Cc)wi = XiPi, - Pi-Dp = 0, (16) 

and ifs {y/iTT-i) — Tc {y/'mi) = (i-s — ts) 2 //wi for all y >0 such that Ve > 0, p,{{y — e, y + e)) > 0. 


This theorem extends Theorem 2.1 and Proposition 5.1 of [T] to completely monotone kernels G and K. 
Its proof relies on the same arguments that we recall briefly in Appendix 1C.II An interesting consequence 
of (IT6)) is the connection made between the price propagator and the decay kernel of the intensity. For general 
completely monotone functions ®, this yields in particular the following condition: 

Vp > 0, (is - tc) w(dp) = {l-v) pX{dp). (17) 

Thus, to exclude PMS, w{dp) has to be proportional to p A(dp) and therefore the decay speed of K should be 
higher than that of G, whatever their functional form (as soon as they are completely monotone). Besides, 
we can make the two following comments. 


First, by dividing both sides of equation CD by p, integrating on (0, +oo) and using Fubini’s theorem, one 
gets the necessary (but not sufficient) martingale price condition 


l-V = (is - ic) 


w{dp) 


= {is y (yj exp(-pt) dt^ w{dp) 


= {is-ic)J (^J exp{-pt) w{dp)^ dt 

poo 

= (is - ic) / K{t) dt =: DBR. 

Jo 


(18) 


This equation means that the proportion of transient impact should be equal to the directional branching 
ratio, which we define as 

I - I 

DBR = (is - ic) / K{t) dt = ^^ X BR, (19) 

Jo is + ic 

where BR is the usual branching ratio for Hawkes-based models that count positively price changes of both 
signs (see for instance Hardiman and Bouchaud [H]). This result is intuitive since the DBR represents 
the average number of “children trades of the same sign” for each trade, which, to obtain a diffusive price 
process, should be equal to the proportion of price impact that vanishes over time. Although it is only a 
necessary condition, equation (|18|) gives a quite general numerical criterion to assess empirically whether an 
observed price process is compatible with the martingale property, or rather persistent (DBR > 1 — jz) or 
mean-reverting (DBR < 1 — v). 


Second, the power-law kernels 

G{u) = v + {l- iy){l -I- CG X t)““, K{u) = (1 -I- Cif X 
are particular cases of with 


A (dp) 


p° ^exp(-p/cG) 

r(a) 


dp. 


r(;(dp) 


p*^ exp{-p/cK) 

r(i + e) 4+^ 


dp. 
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Equation m then yields 


a = e, CG = CK = c, 


Lq l, 


= l-u. 


ec 


Let us recall that if iL is a power-law, one must have e > 0 to obtain integrability, which is a necessary 
condition for the Hawkes process to be stationary. Also, in that case, the process can only have long-memory 
(i.e. non-integrable auto-covariance) if the Hawkes norm is equal to on^l and if e £ (0,1/2), see Bremaud and 
Massoulie, Theorem 1 in El. In that case, the auto-covariance decays asymptotically as t We thus 

reach exactly the same conclusion as Bouchaud et al. [9], who give the diffusive price condition /3 = (1 — 7 )/ 2 , 
where 7 is the decay exponent of the auto-correlation of trade signs, and l3 = a is the decay exponent of the 
propagator. Note that we used a totally different approach (absence of Price Manipulation Strategies), and 
that equation (HZl) is a possible generalization of their result to a wider class of kernels, within the Hawkes 
framework. 


The calibration results presented in Section |T] allow us to confront real stock data to the martingale price 
condition obtained above. In particular, it is easy to check whether the proportion of transient impact 
1 — ^ Ai is smaller, equal or greater than the directional branching ratio DBR. Although we do not 

expect the condition to be exactly satisfied in practice, we find it interesting to evaluate how much (and 
which way) real data deviate from the theoretical equilibrium. 


2.3 The optimal execution strategy 

In [T], we obtained an explicit characterization of the optimal execution strategy that minimzes E[C'(A)] 
among strategies such that Xq £ K and Xt+ = 0 when G{t) = and K{t) = e~^*. It is of interest to 
generalize this result to multi-exponential kernels ([7]). This is in principle possible. In fact, the model is still 
Markovian and Afline with respect to the state variable {Xt, Pt,St,D\, and the cost is still quadratic. 

As in [T], one should first guess the quadratic form of the cost function, then derive necessary conditions on 
its coefficients, and last run a verification argument. However, we know from Alfonsi and Schied [3] that the 
optimal strategy without the flow of trades (i.e. = 0) is already quite involved and is characterized through 

a matrix Riccati equation. In our context, the system of ordinary differential equations that characterize 
the cost function would be much more intricated, and one would presumably have to solve it with numerical 
methods, which are less efficient than closed formulas for high-frequency trading. However, in the particular 
case where the propagator is kept exponential 

p 

G{u) = V + {1 — ly) eyip{—pu), K{u) = Wj exp(—^d^u), (20) 

i=l 

with 0 < /3i < • • • < /Ip and wi,... ,Wp > 0, it is still possible to derive explicitly the optimal execution 
strategy. In fact, we can handle the same arguments as in [1] and obtain the following result, proved in 
Appendix 1C.21 

Theorem 2.2. Let at = Wi(ts — (-c) and H, the square matrix of order p defined by 

1 — LJ —Vi ^i,j 

We also define the two continuous matrix functions w 

A/fk _ nrk 

A;>0 ^ ^ A;>0 ^ ’ 

^We refer to Hardiman et al. na for a test of this property on market data, and to Jaisson and Rosenbaum m for a study 
of Hawkes processes with an Hawkes norm close to one. 

^When M is invertible, C{M) = M~^[Ip — exp(—M)] and uj{M) = Af“^[exp(—M) — 7p + M]. 
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Then, the strategy X* that minimizes the expected cost E[C(X)] satisfies a.s. and dt-a.e on (0,T), 


(1 - iy)x: =-[l + p{T- t)]Dt + ^[2 + p{T - t)] (23) 

+ 2 + l{T-t) ~ • • • ’ 

_ i _(0 —(0 

where 5\ = kJ — for i £ {1, ■ ’ ‘ jP} intensity imbalances. Moreover, the optimal strategy is fully 
characterized by eguation (1231) . 

Though restricted to ([20]), we believe that this extension of the result of [T| may be relevant for applications. 
In fact, on our dataset, there is not much gain to use the multi-exponential price propagator rather than the 
mono-exponential one, see Figure |TJ Instead, for the decay kernel of the intensity, considering an exponential 
mixture allows to produce a richer variety of autocovariance functions, see Figure |31 


3 Calibration method 

3.1 Description of the dataset 

We consider tick-by-tick data provided by the French investment bank Natixis, to which we are grateful. 
The data contains all the changes in prices and volumes of the best bid and best ask, for two actively traded 
French stocks: BNP Paribas and Total. 

The data is selected between 11a.m. and Ip.ni., for every trading day between January and September 2012 
and 2013. We exclude the three last months of the year, where activity decreases on average, along with the 
months where the tick size deviates from 0.005 euros. The two-hour window around noon is chosen to obtain 
a rather stable and uniform behavior of market activity, see e.g. Lehalle and Laruelle [22], p. 112. This way, 
for each stock separately and with minimal data treatment, we can reasonably assume that each two-hour 
window of trading is a realization of the same random price process. 

In the initial dataset, for each stock separately, each line corresponds either to an update in price and/or 
quantity at one of the best queues (triggered by a market event such as a market order, a limit order or 
a cancellation), or a new trade executed for a given volume at a given price. The time stamps for these 
updates are precise to the millisecond. We reduce this data by aggregating the events happening on the 
same millisecond: we only keep track of the best prices at the beginning and at the end of each time stamp, 
which yields the aggregated price impact of the events that happened “simultaneously”, i.e. on the same 
millisecond. Similarly, we sum all the volumes that were executed on the same time stamp. We obtain a 
simplified sequence of market events, among which a minority is associated to a traded volume and/or to a 
price change. 

A correspondence should be clarified between the theoretical items of the models of [1] and Section |2 and 
actual financial data. Different possibilities may be relevant, but our choices are the following: 


• We define the “market price” Pt as the midpoint price, i.e. the average of the best bid price and the 
best ask price at any time t. 

• We only consider time stamps where the midpoint price jumps. In other words, we ignore the trades 
and cancellations that do not empty either the best bid or the best ask, as well as the passive limit 
orders that do not define a new best price. For the stocks that we consider, this gives an average latency 
of one to four seconds between two consecutive time stamps. This is in agreement with the time scale 
that is thought of in the theoretical model of [I], which is not of ultra-high frequency. 



• We express the time in hours, and note T = 2 the length of the window that we consider for each trading 
day. Throughout the paper, we note t S (0, T) the time stamps which correspond to midpoint jnmps 
triggered by trades, i.e. by limit orders that cross the spread or by market orders. These correspond 
to the jumps of the process N of the theoretical model: they are marked by both a price jump AMr 
(of one or several half-ticks), and an executed volume AVr > 0 expressed in number of shares. The 
time stamps of other price jumps are noted 9 S (0, T). They are triggered by cancellations and passive 
limit orders, with no executed volume, and they are assumed to enforce on average the deterministic 
resilience effect as in [^. Between two trades, the deviation of the price from this deterministic average 
is considered as a noise process, modeled using an arithmetic Brownian motion. 

Some key statistics for these items are given in Table [T] for BNP Paribas and Total. 


Stock 

BNP Paribas 

Total 

Year 

2012 

2013 

2012 

2013 

Average midprice 

32.4 

44.9 

38.2 

39.0 

Tick size 

0.005 

0.005 

0.005 

0.005 

Number of mid. changes per hour 

1909 

1699 

1209 

929 

Proportion due to transactions 

10 .0% 

7.9% 

7.6% 

6.9% 

mi 

776 

636 

978 

963 

m2/mf 

3.38 

4.69 

4.30 

6.72 

Average size of the first queue 

1398 

1136 

1710 

1779 


Table 1: Table of statistics for the stocks BNP Paribas and Total on the periods January-September 2012- 
2013, between 11 a.m. and 1 p.m. January 2012 is excluded for BNP Paribas because the tick size dropped 
below 0.005. We give the proportion of midpoint changes which are triggered by trades, the remaining 
proportion being triggered by cancellations or passive limit orders, mi is the average volume of transactions 
that trigger price moves, and m2 is the average squared volume for these transactions. The greater the ratio 
m2/mf, the more variance in the distribution of traded volumes. 


3.2 Overview of the calibration process 

One specificity of the price model given by equation ([4]) is that it is composed of two separate components: 

• The point process N for the trades that trigger the price moves, for the which time stamps r and the 
marks (the price jumps AM^ and the executed volumes AVr) are modeled and estimated jointly, 

• The propagator model, which conditionally to the midpoint jumps due to trades, is a continuous-time 
linear regression model with a Gaussian noise process aWt- 

Therefore, the trades are modeled using marked Hawkes processes, and conditionally to them, the price is 
Gaussian. This segmentation has at least three advantages. First, the calibration process is simpler since the 
two parts can be estimated independently, which significantly reduces the dimension of the problem. Second, 
the estimation results on each side are robust to the choices made in the other. For instance, if one wants to 
modify the Hawkes modeling for the trades, then our results for the propagator are still valid, and vice versa. 
Eventually, the results of Section 12.21 include some theoretical links between the Hawkes parameters and the 
propagator, and it seems more rigorous to confront these links to our calibration results when the two parts 
are estimated independently. 

Our calibration protocol as a whole being somewhat sophisticated, we test its validity and robustness by 
running it on simulated data. In Sections |4] and [Sj we give the results of our analysis for these simulations as 
well as for real hnancial data. 


9 











3.3 Estimation of the propagator 

3.3.1 Framework 


In this section we explain how the propagator model introduced in Section [2] can be adapted for practical 
applications, in particular for its calibration. This requires to consider the two following points: 

• In practice, the price impact of transactions is not proportional to their volumes. It is typically of a 
few ticks, while the volumes span a wider range of values. Therefore, one must choose between “price 
resilience” and “volume resilience” as in Alfonsi et al. [2]. The first choice corresponds to modeling 
the mean-reversion property of market prices, the second describes how liquidity “regenerates” after a 
trade, and the two are only equivalent for linear price impact. 

• The evolution of the price between two transactions is very noisy, and the propagator model only 
explains a part of its variance. Therefore, we need to control the variance of the estimation to obtain 
satisfying calibration results. 


For the first point, we choose to model price resilience, which is easier to measure in practice and has been 
considered more often in the literature. This boils down to replacing AN^/q by the midprice jumps AM^ in 
equation ([¥]). For the second point, an intuitive possibility consists in restraining the propagator regression 
to a finite time window Arw > 0, and to assume that the model predicts the price increment Pt — Pt-An-w 
for t > Arw instead of Pt — Pq. If the noise is an additive Brownian term aWt, this fixes the variance of the 
predicted variables to ct^Ar-vv instead of t S [ArwiT"]- We obtain the modified price model 

Pt = ft-Anw + I] AMrG{t-T) + a{Wt - Wt-A^^)- (24) 

t —ARW<T<t 

Of course, Arw must be such that G(Arw) — G(oo) is small compared to G(oo) for the model to be a 
meaningful approximation of the original model (|3]), see Remark 1 2. II This condition also allows to avoid bias 
in the estimation of the propagator G. We fix Arw = 0.5 hours (30 minutes) throughout the sequel of this 
paper, basing ourselves on preliminary observations that we do not detail here. Note that within the range 
Arw € [0.1,1], the choice of this parameter has little impact on the results. One can verify a posteriori that 
our estimations of G are compatible with G(0.5) — G(oo) <C G(c»). 


The predicted price increment between t — Arw and t is given by 

Pt-Pt-A^^ = Y. ^M^Git-r) (25) 

t — ABW<T<t 

where Pt-An-w is the real midpoint price at time t — Arw, taken directly from the data. Equation (12411 
becomes 

Pt = Pt + c7(Wt-IFt_ARw)- (26) 

Conditionally to Pt-A^w to process M, one has Pt ^ Af{Pt, ct^Arw)- In this setting, the Maximum 
Likelihood Estimator of G is equivalent to the Least Squares Estimator. We thus minimize numerically on 
the parameters of G the quadratic error 


f(G)= [Pe{G)-Pe]\ (27) 

^B.Vi<S<T 

where the 0’s are the occurrences of price jumps due to cancellations or passive limit orders. To get a better 
understanding of the shape of the propagator, we first estimate G in an “unconstrained” manner, i.e. as the 
linear interpolation of a discrete set of points. Thus, we model G as 


i-i 


_1 + P~ P) 9 i +1 

G(t) - ffil[t,,ABw[(‘) + ^^1 


2=0 


^2+1 




hdW- 
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where ti, - ■ ■ ,ti are fixed a priori and {gi, ■ ■ ■ ,gi) is the parameter to estimate. We see that the resulting 
curve, which is given is Section 0] for stock data, has an increasing short-range part, and switches to a 
decreasing mode after a few seconds. One has G(0) = 1, but G reaches a point above unity before it enters 
its decreasing regime. Let us recall that in an idealized model without bid-ask spread, Alfonsi et al. |4j and 
Gatheral et al. [16| show that G has to be decreasing and convex around zero to exclude PMS and some 
market instability. This is not the case on our dataset. We interpret this as the fact that after a trade, the 
new bid-ask is generally formed around the impacted price. Thus, during a few seconds, limit orders and 
cancellations tend to impact the midprice in the same direction as the trade. This motivates us to distinguish 
the propagator G{t) and the functional form of its long-range decay that we call the resilience, noted R{t). 
This way, we can allow R{0) > 1 and impose that R is decreasing. One can then link G and R with a simple 
linear interpolation between t = 0 and t = Ladj) with Ladj > 0 the “adjustment lag” of the market 


G(t) 


1 + (i?(Ladj) - 1) 





This choice has the merit that once Ladj is fixed, only the resilience curve needs to be estimated since G is 
characterized by R. Therefore, to estimate R with an imposed decreasing functional form, we place ourselves 
in the following version of the price model 


Pt=Pt-A^^+ Y. ^MrRit-T)+ Y 

t ^'RW <C.t i/adj ^ 


1 + (i?(Ladj) - 1) 



+a(W*-W*_A,w). 


We consider two types of parameterization for the resilience R{t)-. 


• The mono-exponential curve 

i?(t) = 7 [1 - A(1 - exp(-pt))], (28) 

with three parameters 7 , p > 0, A G [0,1]. 7 is an amplification factor, p is the resilience speed of the 
market, A is the transient part of the price impact of trades, and u = 1 — A is the permanent part. The 
mono-exponential curve is the type of resilience considered in the theoretical model of [T|. 


• The multi-exponential curve 


R{t) =7 1 


P 

^A*(l - exp(-pjt)) , 


(29) 


is a generalization of the previous one, with 2p + 1 parameters 7, pi, • • • , pp > 0, Ai, • • • , Ap G [0,1], 
E.A, < 1. For 1 < i < p, Ai is the proportion of transient impact that decays at speed pi, and 
ly = 1 — Ai is the proportion of permanent impact. 


For both parameterizations, we estimate a posteriori the volatility cr of the Brownian noise with 



n 


E 


Pt -Pq- E GiT 

. _ 0<t<T _ 

n X T 



(30) 


where n is the number of days of the sample, and for i G { 1 , • • • , n}, P* and M* are respectively the real price 
and the midprice jump process for day i, and the r’s are the jump times of M*. Also, since the prediction 
model defined by (I25|) and (1261) can be seen as a continuous-time linear regression, where the explained 
variables are the price increments Pt — Pj-Arw the regressors are the past price jumps AM,, triggered 
by trades, we can evaluate its quality of fit using a usual analysis of variance. We define the value as 


n 

E E K - Pe? 

2 -1 ^—1 Arw<^<T’ 

r = \ — - 

' ^ n _ 2 ’ 

E E [p^-p^arw-^ 

* —1 Ar.w<^<7’ 


(31) 
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where for z S { 1 , • ‘ ‘ jIT-}, is the predicted price process for day i, and 

_ 1 

E E (Ps-PLa^J 

2^1=\ Ir I ARw<e<T 

is the average price move between B — Arw and 9, where the 0 ’s are the times of price changes with no 
executed volumes and and is the number of such price changes on day i. Note that since there is no 
constant in the regression model ((26)l . the could theoretically be negative, but this is not the case in 
practice. The constitutes a useful comparison criterion between different estimated propagators, and we 
use it in Section 21 

Now that the global practical framework is set, the estimation protocol for G needs to be detailed. This is 
the object of the following section. 


3.3.2 Estimation protocol 

We use a multi-step estimation protocol, that mainly resorts to the minimization of the quadratic error 
£ defined in ((27|l . When G{t) is linear with respect to its parameters, £ is quadratic and one step of 
Newton-Raphson’s algorithm is enough to find the minimum (see Appendix |A|. When the dependency in the 
parameters is non-linear, we first use grid minimizations to find a suitable starting point for the algorithm. 

As a first step, we estimate the “unconstrained” propagator curve. Then, we estimate the resilience curve 
using the two parameterizations presented in Section [3.3.II 

Estimation of the unconstrained propagator curve 

We first estimate G by the linear interpolation G 


i-i 


G{t) — 5il[t,,T[(0 + E 


i=0 


{tj+i - t)gi + {t- ti)gi+i 
ii+l ti 


'-[tiM 


k[W- 


For ti,--- ,ti fixed a priori, G is linear with respect to ( 51 ,-•• , 3 ;). Thus, one step of Newton-Raphson’s 
method (see Appendix I A. II) determines the parameters that minimize the quadratic error £(G). To approx¬ 
imate the long-range propagator, we choose a uniform grid ti = i/l with I = 20 on the interval [0,0.2]. On 
the other hand, for a zoom on the beginning of the curve, we concentrate the t^’s near zero. 

Estimation of the multi-exponential resilience curve 

The simultaneous estimation of multiple pi’s being too unstable, we choose to fix four components associated 
to four simple characteristic time scales (the pi’s are expressed in inverse hours): pi = 6 (ten minutes), 
P 2 — 60 (one minute), pa = 120 (thirty seconds) and p 4 = 360 (ten seconds). We then assume that the 
vector (pi, • • • , P4) is rich enough to represent all the relevant time scales in our framework, and we focus on 
the weights Ai, • • • , A4 associated to each scale to characterize the decay of the curve. The multi-exponential 
resilience given by equation (1291) becomes 


4 

R{t) = 77 + exp(-pit), 

where we re-parameterize v = 7(1 — > 0 Ai = yAi > 0. Reciprocally, one has 7 = 77 -|- 

and Ai = Ai/ 7 . Since the p£s are fixed, the resilience curve R(t) is linear w.r.t. the parameter (77, Ai, • • • , A4) 
that remains to be estimated, thus Newton-Raphson’s algorithm (see Appendix lA.21) converges with a single 
iteration. We then select the significant pi’s as follows: 
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1. A first estimation yields a “full” parameter (^, Ai,--- ,A 4 ). Some of the resulting A^’s may be non¬ 
positive, which is incompatible with the model. 

2. While there exists i such that Ai < 0, we remove the pi corresponding to the minimal Xi, and we launch 
the algorithm again with one less parameter. 

3. Eventually, we have selected one to four “significant” pi’s, of associated weights A^’s that are positive, 
and the estimation process is complete. 

Since each of these steps only take one iteration of Newton-Raphson’s algorithm, the whole estimation protocol 
for the multi-exponential curve is quite fast. Therefore, in order to estimate the market adjustment lag Ladj) 
we can conduct the estimation several times for Ladj on some discrete grid, and compare the regression r^’s 
as defined by (ISTI) . The result associated to the maximal gives the parameters 7muiti, Amuiti and Pmuiti for 
the multi-exponential resilience, along with the adjustment lag Ladj- 

Estimation of the mono-exponential resilience curve 

The multi-exponential estimation presented above serves as a starting point for the following. The market 
adjustment lag Ladj is already estimated, along with the associated set of parameters Amuiti, Pmuiti for 

the multi-exponential resilience curve. We set 


A* 




‘multi A 


(32) 


‘multi 5 


A 


as a starting parameter for the mono-exponential estimation. As in the multi-exponential case, we re¬ 
parameterize (E51) as 

R{t) =T! + X exp(—p<), 

with V = 7(1 — A) > 0 and A = 7 A > 0. We then proceed as follows 

1. We use Newton-Raphson’s algorithm to minimize the quadratic error on the whole parameter (I7, A,p) 
(see ADDendix lA.2.2l for p = 1 exponential component). If the starting point is convex and the algorithm 
converges to a satisfying level, we proceed directly to Step 6 . Else, we go to Step 2. 

2. Keeping p fixed to its starting value ((5^ . the dependency of R{t) on v and A is linear. Thus, with one 
step of Newton-Raphson’s algorithm, we get the optimal values of v and A for the current value of p. 

3. For 7 = 17 -|- A fixed by Step 2, A initialized to A /7 and p as in (|32l) . we minimize the quadratic error 
(A, p) I— £{X^ p) on a local two-dimensional grid in the vicinity of the starting point. 

4. The pair that reaches the minimum of the error grid at Step 3 is again used as a starting point to 
Newton-Raphson’s algorithm, to determine the optimal (A,p) for the current fixed value of 7 , using 
the “unit” mono-exponential parameterization of Appendix I A. 2. ll We actualize (A, p) to this optimum, 
along with v = 7(1 — A) and A = 7 A. 

5. The parameter [v, A, p) is now in a region where the quadratic error is more likely to be convex. 
Therefore, we use this new starting point for an error minimization using Newton-Raphson’s algorithm 
on the whole parameter. 

6 . We obtain the parameter 7 mono, Amono, Pmono for the mono-exponential resilience curve. 

The above estimation protocol for the mono-exponential resilience curve may seem complicated: in particular, 
it is more subtle than the multi-exponential estimation. The reason for this is that we want here to determine 
the most significant characteristic time scale of the resilience through the parameter p. The dependency of 
the quadratic error E on this parameter being non-linear, nothing guarantees a priori that Newton-Raphson’s 
algorithm (or more simply a gradient algorithm) has a convex starting point, which is a necessary condition 
to ensure its convergence. Hence we have to proceed more carefully and introduce several intermediary steps. 
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3.4 Estimation of the Hawkes parameters 

3.4.1 Framework 


Independently of the propagator, we also estimate the parameters of the Hawkes-based model presented in 
Section [2] for the price jumps due to transactions. We choose the self-excitation functions (ps and tpc to be 
afhne, i.e. 

(fisix) = (l)°s+ (t>lx , <Pc(a;) = (j)l + (t)lx. (33) 

In the standard Hawkes framework, self-excitation in the order flow is not marked, i.e. only the constant terms 
4>^, 4>^ appear in and ipc- In spite of its simplicity, the affine structure allows us to underline the deviation 
from the standard Hawkes benchmark, and to detect an increasing part in the self-excitation function. 

As pointed out in Section 13. 3. 11 there are two possible interpretations for the marks associated to the jumps of 
N. Since each of these jumps corresponds to a price jump due to a transaction, they are all associated to two 
positive variables: the price impact on the one hand, and the traded volume on the other hand. Therefore, 
we estimate three sets of parameters for different versions of the Hawkes model (unit marks, volume marks, 
and price marks), each with a different practical interpretation of the intensity jump terms. Precisely, we 
replace in ([H]) and (ITUl) at the jump times t by either of the three possibilities 


^s/c,unit’ 


aO 

^s/c,vol. 


'i^sVc,vol.l^kt|/TOi, (j)[ 


°/ ■ ■ 
s/c,price 


^s/c,price 


I AM* 


where mi is the average executed volume and m is the average price impact. 


(34) 


3.4.2 Estimation protocol 

Our estimation protocol for the Hawkes part of the model is then as follows: we first estimate the mono¬ 
exponential Hawkes model K{u) = exp(—/3u), which allows us to estimate the Hawkes norm and its repar¬ 
tition in terms of self and cross-excitation, and to select the optimal mark type for the jumps. Then we 
estimate the multi-exponential Hawkes model K(u) = 'X’i exp(—/3iit) with the (3iS fixed a priori. 

Mono-exponential kernel 

Let us consider the mono-exponential Hawkes model of equation ([2|), for which the Hawkes decay kernel 
is K{u) = exp(—/3 m), /3 > 0. We first focus on the parameters of the total intensity Et = «:)*' -|- k/" by 
aggregating all the price jumps due to trades, regardless of their signs. In the mono-exponential case, one 
has 

dEt = -/3 (Et - 2/Coo) dt + l dJ*, 

where l is the average excitation, so that the jumps of J have an average of one. We use a Generalized 
Method of Moments (GMM) to estimate /3, Koo and t. We divide the time window [0,T] of length T = 2 
hours in 720 bins of length h = 1/360 (i.e. ten seconds). Then, we compute the number AJ/ of price jumps 
due to trades in the time bin [{I — l)h,lh\, Z S {1, • • • , [T//iJ} on day i G {1, • ‘ ‘ ^n}, for each time bin and 
each day. If I is the row index and i is the column index, we obtain a \ T/h\ x n matrix of which the entries 
are the positive numbers AJ/. We normalize this dataset by dividing each column by its mean value and 
multiplying the whole matrix by the original global mean value, so that the global mean is unchanged and 
each column has the same mean. We first compute the empirical mean A J and variance V of the discrete 
process A J 


AJ = 


n X [r /h\ 


n [T/h\ 

E E 


[T/h\ 


V = 


i=l 1^1 


n X [T/h\ - 1 


EE [Ai/-AJ 


Z=1 
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The average jump intensity 2k of the total jump process is obtained with the formula 2k = IS.J /h. Besides, 
the empirical auto-correlation function of A J is given by 


V/CG {I,--- ,fc„ax}, ^{k) 



nx -k) 


n VT/h\ 


^ ^ aJ/ aju 

i—1 



(35) 


where fcmax = 36 is the maximum lag (so that the maximum range k^^yji = 0.1 equals six minutes). Using 
the results of Da Fonseca and Zaatour m for mono-exponential Hawkes processes, we have that decays 
as exp(—(/3 — i,)k). Therefore, the exponential fit of the empirical curve ‘^^{k) yields an estimate oi d := f3 — i. 
Then, we also get from [12] 


This relation can be inverted to obtain an estimate for /3: if we note z/i = (1 — exp{—dh))/d, we get 


13 = d 


\ VI{2K)-Zh 
h- Zh 


Then, l = (3 — d and Koo = (l~t//3) n can be deduced from the above equation. We also obtain the 
mono-exponential branching ratio 

HUmono — 13 • 

Keeping (3, l and Kqo fixed to these GMM estimates, we now turn to the bi-dimensional intensity model Q. 
We use Maximum Likelihood Estimation (see Appendix |B]) on one-dimensional grids to determine the self 
and cross-excitation parameters: 


1. We determine the proportion u G [0,1] such that ts = u t, ic = (1 ~ u) t maximize the likelihood of 
the two-dimensional intensity (k^, k~), where is and tc are respectively the average self-excitation and 
cross-excitation parameters. 

2. For volume marks and price marks separately, we determine the proportion Ug G [0,1] such that 

= Ug is, ipl = (l—Ug) is maximize the likelihood of (k^, k~), where t/fg , (j)l are defined in equation (1551) . 
Similarly, we determine the optimal proportion Uc for </)() = Uc ic, = (1 ~ ^c) tc- For tg and ic fixed, 
we obtain the optimal constant and linear parts for self and cross-excitation, for the two possible types 
of marks. 

3. The likelihoods obtained for the three models are then compared to determine which of the unit / 
volumes / price marks yield the best model. 

Eventually, we obtain estimates for all the parameters /3mono, ^toomono,'('smono> ^smono;'('cmono>'^cmono of the 
mono-exponential Hawkes model, along with the optimal type of marks. 

Multi-exponential kernel 

We turn to the multi-exponential Hawkes model K{u) = exp(—/3iu). As in the case of the estimation 

of the multi-exponential resilience in Section [3.3.21 we fix four /3i’s associated to four simple characteristic 
time scales. In fact, we choose the same time scales as for the resilience: /3i = 6,/32 = 60,= 120 and 
Pi = 360. We then calibrate the wds associated to each Pi, and these weights tune the shape of the Hawkes 
kernel. 

The results of the mono-exponential estimation are used to select the type of marks (unit, volume or price) 
and to get a starting point for Kqo, PgiPl^Pcy 4^1 tfi^ branching ratio HR. The starting point for the wPs 
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is chosen to be uniformly distributed 


BRn 


v^4 1 /4 


X 


1 

4’ 


with a scaling that matches the initial branching ratio. Then, we maximize the likelihood of the model on 
the parameter (/Coo,rci, W 2 , W 3 , rc 4 ) using Newton-Raphson’s algorithm, as explained in Appendix [B] We use 
the same selection method as for the multi-exponential resilience estimation of Section 13.3.21 if at least one 
of the iCj’s is non-positive, we delete the Pi associated to the minimal Wi and launch the algorithm again, 
with one less parameter. Finally, we multiply ((/)g, , 0°, 0^) by the sum of the remaining w^’s, and scale 

the latter to one. Without changing the overall model, this imposes Ar(0) = 1 for the Hawkes decay kernel 
K. We obtain the parameters /3muiti, Wmuiti, '/'smuiti: </’cmuiti> for the multi-exponential 

Hawkes model. 


4 Calibration results 


4.1 Description of the results 


This section is dedicated to the presentation of our calibration results. The calibration method of Section |3] 
is first applied to simulated data to test its validity, and then to actual financial data from French stocks. 
We also provide some qualitative comments. For each simulated dataset and each stock, the results are 
summarized in tables, plus a few graphs for BNP Paribas. The content of the tables is explained below. 

Adjustment lag table: This table gives the regressions r^’s of the multi-exponential resilience curve, for 
several values of the market adjustment lag Tadj- R is used to select the optimal value of Ladj on a discrete 
grid. 

Resilience table: The resilience table gives the estimation results for the propagator. We give the selected 
adjustment lag Ladj and the estimated parameters for the two types of resilience curve 


^mono (t) — 'Tmono [1 mono (1 - exp(- 

' P mono m, 

Rnmlti(t) = 7multi 1 ~ ~ ®^P(~^'multi^)) 


along with the estimated volatility a of the noise and the regression , defined respectively by equations (1501) 
and (lOTl) . 

Marks table: In this table, we give the maximized log-likelihoods per point /lunit, >Cvoi. and /Iprice for each 
type of mark (unit, volumes and price jumps), in the mono-exponential Hawkes model. It serves as a selection 
criterion for the optimal type of mark. 

Intensity table: This table gives the estimated parameters for the Hawkes model described in Section lOTTl 
for both the mono-exponentiel decay kernel K(u) = exp (—/3m) and the multi-exponential one K(u) = 
Wi exp{—Piu)■ We also give the maximized log-likelihoods per point £mono and £muiti, which can 
be compared to one another or between datasets to quantify the quality of ht of the Hawkes model. Eventu¬ 
ally, we give the branching ratio BR and the directional branching ratio DBR defined by equation ([T5)) , that 
are obtained with the multi-exponential parameterization. 


4.2 Simulated data 

We first give in Tables [2] and [3] the results of our calibration protocol on two datasets simulated with the price 
model (Ill). In each table, the first column gives the “real” simulation parameters and the second gives the 
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estimated ones. Both datasets are composed of 150 independent realizations of the price process on two-hour 
windows, and we choose simulation parameters close to what is found further for stock data in order to obtain 
relevant benchmarks. Note that Simulation 1 features a non-zero Brownian volatility, whereas Simulation 2 
is generated by the “pure” propagator model without noise. 


Year 

Simu. 

Calib. 

Marks type 

Volume 

Volume 

/^multi 

60/360 

60/360 

Wmulti 

0.100/0.900 

0.102/0.898 

^oomulti 

15.0 

15.2 

'(’s multi 

110.5/19.5 

109.8/20.9 

'(’c multi 

66.5/3.5 

59.7/9.7 

.^multi 

- 

3.1659 

^mono 

- 

153.0 

^oomono 

- 

16.6 

^smono 

- 

68.7/13.1 

^cmono 

- 

37.4/6.1 

c 

-^mono 

- 

3.1560 

BR 

0.833 

0.839 

DBR 

0.250 

0.257 


Year 

Simu. 

Calib. 

T^adj (sec) 

4 

4 

O^multi 

2.70 

2.35 

Pmulti 

60/360 

6/60/360 

Amulti 

0.50/0.10 

0.13/0.35/0.11 

J^multi 

0.40 

0.41 

C’multi 

0.1000 

0.1917 

„2 

' multi 

- 

9.554% 

Tmono 

- 

2.38 

Pmono 

- 

68.2 

'^mono 

- 

0.55 

^mono 

- 

0.1923 

mono 

- 

9.519% 


Table 2: Calibration of the resilience (left) and intensity (right) for Simulation 1. For the (j)^s, the first entry 
is the constant term and the second one is the linear term. 


Year 

Simu. 

Calib. 

Marks type 

Volume 

Volume 

/^multi 

120/360 

6/120/360 

Wmulti 

0.050/0.950 

0.0007/0.0505/0.9488 

^oomulti 

40.0 

39.1 

/•s multi 

84.0/36.0 

72.8/40.9 

/•c multi 

45.0/5.0 

47.4/7.7 

.^multi 

- 

2.7218 

/^mono 

- 

82.2 

^oomono 

- 

19.3 

^smono 

- 

27.3/15.4 

^cmono 

- 

17.8/2.9 

-^mono 

- 

2.6740 

BR 

0.519 

0.535 

DBR 

0.214 

0.186 


Year 

Simu. 

Calib. 

-^adj (S6c) 

2 

2 

O'multi 

- 

3.05 

Pmulti 

- 

6/120 

Amulti 

- 

0.0005/0.6850 

J^multi 

- 

0.31 

C’multi 

- 

0.0055 

r‘2 

' multi 

- 

96.92% 

Tmono 

3.20 

3.06 

Pmono 

130 

121.3 

'^mono 

0.70 

0.69 

<^mono 

0.0000 

0.0055 

j,‘2 

mono 

- 

96.92% 


Table 3: Calibration of the resilience (left) and intensity (right) for Simulation 2. For the (j)^s, the first entry 
is the constant term and the second one is the linear term. 

Overall, the accuracy of the estimation is satisfying. The estimated Hawkes parameters are very close 
to the real ones, although the dimensionality is high. Importantly, the branching ratios and directional 
branching ratios are all determined accurately, within a precision of ±0.03 on our experiments. Concerning 
the propagator, the results are more noisy for Simulation 1, which is not surprising since it includes some 
Brownian noise. Still, the proportion of transient impact is nearly exact and the dominant time scale is well 
determined. Simulation 2 is generated with a mono-exponential propagator, and the resilience speed pmono is 
slightly underestimated; however this parameter is less stable than the A’s and the accuracy that we obtain 
seems reasonable. In this second case, the values that we hnd for the volatility and the regression are 
satisfyingly close to 0 and 100 % respectively. 


17 







4.3 BNP Paribas 


Tables m [S] and [S] and Figures [U [5] and [3] present our estimation results for the French stock BNP Paribas on 
the periods February-September 2012 and January-September 2013. 


Ladj (sec) 

0 

2 

4 

6 

^Liti(2012) 

24.572% 

24.675% 

24.677% 

24.672% 

^Liti(2013) 

10.607% 

10.674% 

10.668% 

10.649% 


Table 4: Regression for the multi-exponential resilience curve, evaluated for several market adjustment 
lags Ladj = 0, 2,4 ,6 seconds, for the stock BNP Paribas. 


Marks type 

Unit 

Volume 

Price jump 

£„,ono(2012) 

2.6804 

2.6826 

2.6791 

£„,ono(2013) 

2.5772 

2.5794 

2.5750 


Table 5: Log-likelihood per point for the mono-exponential Hawkes model, evaluated for several types of 
marks: unit, volumes and price jumps (see eq. ((34ll L for the stock BNP Paribas. 


Year 

2012 

2013 

Marks type 

Volume 

Volume 

/Jmulti 

6/360 

6/360 

Wnmlti 

0.010/0.990 

0.011/0.989 

^cx)multi 

15.1 

12.1 

<(’s multi 

112.8/18.4 

115.4/15.7 

'/'cmulti 

50.4/2.1 

46.4/0.9 

•^inulti 

2.7720 

2.6708 

/^mono 

73.0 

114.1 

^cxDmono 

13.9 

14.0 

0smono 

38.3/6.2 

58.5/8.0 

^cmono 

17.1/0.7 

23.5/0.5 

r 

■^mono 

2.6826 

2.5794 

BR 

0.820 

0.810 

DBR 

0.351 

0.380 


Year 

2012 

2013 

Ladj (sec) 

4 

2 

O^multi 

2.69 

2.99 

Pmulti 

60 

60/360 

-^multi 

0.61 

0.30/0.53 

J^multi 

0.39 

0.17 

Cnmlti 

0.2253 

0.2153 

„2 

multi 

24.677% 

10.674% 

Tmono 

2.70 

2.56 

Pmono 

60.8 

116.5 

-^mono 

0.62 

0.80 

<^mono 

0.2253 

0.2153 

mono 

24.678% 

10.688% 


Table 6 : Calibration of the resilience (left) and intensity (right) for the stock BNP Paribas for the periods 
February-September 2012 and January-September 2013, between 11 a.m. and 1 p.m. For the (j>’s, the first 
entry is the constant term and the second one is the linear term. 


Let us first look at the estimation results for the propagator, 
lag Ladj defined in Section 13.3.11 is positive and thus that 
estimation yields Ladj = 4 sec. for 2012 and Ladj = 2 sec. 
longer on Figure 2(a) than on Figure |2(b)[ The parameter 
reached by the propagator at the end of the increasing phase 
means that on average, after a large trade, not only does the 
would yielcQ 7 Ri 2 ), but cancellations at the new best queue 
trade. 


Table m and Figure [5] show that the adjustment 
the propagator is increasing near zero. The 
for 2013, and the increasing part lasts indeed 
7 given in Table [5] tunes the maximum value 
We find a result between two and three. This 
bid-ask close around the impacted price (which 
also push the price in the same direction as the 


After its short increasing part, the propagator switches to its resilience mode described by Table [5] and 
Figure[T] The unconstrained resilience curve is quite smooth, and one can observe on Figure [T]that it decays 

^To be more precise, let us consider for example a buy order that increases the ask of one tick. Then, the midprice jumps of 
one half tick. If the bid price follows shortly the ask and increases of one tick, this moves again the mid of one half tick upward, 
which gives 7 = 2 . 
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Lag {min) 


Lag (min) 


(a) 2012 


(b) 2013 


Figure 1: The estimated propagator for BNP Paribas. The plain line is the unconstrained propagator, the 
(blue) dashed line is the mono-exponential resilience curve, and the (green) dot-dashed line is the multi¬ 
exponential resilience curve. 


to a non-zero proportion of permanent impact (~ 40% for 2012 and Ri 20% for 2013). Also, the results given 
in Table [S] indicate that the mono-exponential fit for the resilience is good on this dataset. For 2012, only the 
speed p = 60 (i.e. a characteristic time scale of one minute) is selected in the multi-exponential estimation. 
On the other hand, for 2013, there are two selected speeds (corresponding to one minute and ten seconds), 
but the mono-exponential fit with Pmono = 116.5 (approximately thirty seconds) yields a higher regression r^. 
These dominant characteristic time scales motivate the use of the particular case considered for the optimal 
strategy in Section [231 

We now focus on the estimation results for the Hawkes parameters. Table [5] justifies the selection of volume 
marks: indeed, they yield a higher likelihood per point than unit marks and price marks. Unit marks are 
the benchmark model for Hawkes processes, but they fail to reproduce the fact that large orders trigger 
more activity on the market. Indeed, we see on Table [5] that the self-excitation parameter (j)s and the 
cross-excitation parameter ipc have non-negligible linear parts (10 — 15% for self-excitation and 2 — 5% for 
cross-excitation). As for price marks, we think that they give less information than volume marks since price 
jumps cluster on a few values (one or two ticks in most cases), while the distribution of volumes is much 
wider. 

Hawkes parameters seem to be quite stable, especially in the multi-exponential case where the estimation 
results are very similar for 2012 and 2013. Two decay speeds are selected for the intensity, and these are 
the two extreme ones: the long range /3 = 6 (10 minutes) and the short range (3 = 360 (10 seconds). The 
importance of each time scale j3i can be measured by the proportion of the norm that it accounts for, given 
by 

Wi/I3i 

Here, the long-range component (3 = 6 accounts for r; 40% of the norm, and the short-range one (3 = 360 
for the remaining 60%. Therefore, both decay speeds are important, which is also reflected by the significant 
increase from the log-likelihood per point Umono of the mono-exponential model to Unmiti for the multi¬ 
exponential one. One can deduce that contrary to the propagator, the Hawkes kernel includes at least two 
exponential components. 

Figure |3| gives a visual comparison between the data, the mono-exponential Hawkes model and the multi- 
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(a) 2012 (b) 2013 

Figure 2: Zoom on the first twenty seconds of the propagator curve for BNP Paribas. The plain line is the 
unconstrained curve, the (blue) dashed line is the mono-exponential curve, and the (green) dot-dashed line 
is the multi-exponential curve. The propagator is increasing during a few seconds, before the resilience effect 
kicks in. 


exponential one through the auto-correlation of the number of events. The formula for the empirical auto¬ 
correlation function ^(fc) is given by equation (1551) . Using equations (1151) and (1151) . we have that if h > 0 is 
small and r > 0, ’^{r/h) approximates the auto-correlation function ^(t)/'^(0 ) of the total intensity process 
Et. For a multi-exponential Hawkes kernel, one has 


€{T/h) 


nr) 

<^( 0 ) 


Y^^exp(-6j|r|), 


where the coefficients oi, • • • , ap, bi, ■ ■ ■ ,bp > 0 are determined as in Proposition l2.ll One can see on Figure|5] 
that the mono-exponential model fits the end of the curve rather well but that its initial decay is too slow. On 
the other hand, the multi-exponential model does show a transition between two decay speeds, and captures 
the short-range behavior of the curve better. Still, the accuracy of the fit is not very satisfactory and it seems 
that the functional form of the auto-correlation is more subtle than a multi-exponential one. 

Finally, we confront our calibration results to the conditions derived in Section [5151 for the absence of Price 
Manipulation Strategies in the model. It is complicated in practice to quantify the deviation of our set of 
parameters to the equilibrium using equation (1161) . On the other hand, equation (1181) gives a simpler criterion: 
the directional branching ratio DBR and the proportion 1 — i/oi transient impact should be equal for PMS to 
be ruled out. Here, the standard branching ratio BR « 80% is high, but the directional branching ratio DBR 
~ 40% is quite low, which is due to a non-negligible part of cross-excitation in the order flow. It implies that 
the equilibrium condition is violated since l — 60% for 2012 and I — vk, 80% for 2013. Since l — v> DBR 

holds in both cases, we find that the price process is mean-reverting on average, rather than diffusive. This 
should lead to the existence of PMS in practice, which is the object of Section [5] 


4.4 Total 

Tables [3 island [5] present our estimation results for the French stock Total on the periods January-September 
2012 and January-September 2013. 

The qualitative interpretation of the results is similar to that of Section 15751 Yet, one should note the following 
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(a) 2012 (b) 2013 

Figure 3: Auto-correlation function of the number of midpoint moves triggered by trades (plain line), in log- 
log scale, for BNP Paribas. The (blue) dashed line is the auto-correlation generated by the mono-exponential 
Hawkes model, the (green) dot-dashed line is generated by the multi-exponential Hawkes model. 


Aadj (sec) 

0 

2 

4 

6 

^Liti(2012) 

23.093% 

23.166% 

23.137% 

23.108% 

^Liti(2013) 

11.604% 

11.613% 

11.608% 

11.606% 


Table 7: Regression for the multi-exponential resilience curve, evaluated for several market adjustment 
lags Ladj = 0, 2,4,6 seconds, for the stock Total. 


points that are observable on Table [H First, we notice that there is no significant difference between the 
mono and multi-exponential propagator. Here, contrary to the BNP Paribas case, the fit is slightly better 
with two time scales. Second, the branching ratio BR Ri 60% and the directional branching ratio DBR« 30% 
are smaller for Total, whereas the proportion 1 — i/ oi transient impact (84% for 2012 and 92% for 2013) is 
higher, which means that the price has an even stronger mean-reversion tendency. 
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Marks type 

Unit 

Volume 

Price jump 

£„,o„o(2012) 

2.2981 

2.3034 

2.2965 

£„,o„o(2013) 

2.2065 

2.2127 

2.2063 


Table 8: Log-likelihood per point for the mono-exponential Hawkes model, evaluated for several types of 
marks: unit, volumes and price jumps (see eq. (IMl)'). for the stock Total. 


Year 

2012 

2013 

Ladj (sec) 

2 

2 

O^multi 

3.72 

2.21 

Pmulti 

60/360 

6/120/360 

Amulti 

0.29/0.55 

0.004/0.651/0.268 

^'multi 

0.16 

0.08 

CTmulti 

0.1400 

0.1124 

' multi 

23.166% 

11.613% 

’Tmono 

3.84 

2.65 

Pmono 

187.2 

191.3 

Amono 

0.84 

0.93 

<^mono 

0.1399 

0.1123 

mono 

23.132% 

11.586% 


Year 

2012 

2013 

Marks type 

Volume 

Volume 

/3multi 

120/360 

6/60/360 

Wmulti 

0.052/0.948 

0.010^.035/0.955 

^oomulti 

21.0 

9.7 

/•s multi 

98.7/21.7 

84.5/18.5 

/'c multi 

44.3/3.9 

36.5/0.7 

•^multi 

2.3801 

2.2842 

/^mono 

93.0 

109.1 

^oomono 

9.2 

9.0 

mono 

43.5/9.6 

47.4/10.4 

^cmono 

19.5/1.7 

20.4/0.4 

r 

■^mono 

2.3034 

2.2127 

BR 

0.517 

0.688 

DBR 

0.222 

0.323 


Table 9: Calibration of the resilience (left) and intensity (right) for the stock Total for the periods January- 
September 2012 and January-September 2013, between 11 a.m. and 1 p.m. For the (/)’s, the hrst entry is the 
constant term and the second one is the linear term. 
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5 Test of some Price Manipulation Strategies 


In this section, we apply the optimal strategy derived in [Tj and Theorem 12.21 to our dataset, with the 
parameters obtained by our calibration protocol. Essentially, we run the strategy each day with a zero initial 
and final position. If the model is relevant, this should give some profit on average. This backtest serves as 
a practical evaluation of our calibration results, and of the model itself. 


5.1 Scaling and discretization of the optimal strategy 


The simplest and most natural way is to use the optimal strategy (I23II is to consider a discrete subset 0 of 
[0,T] (possibly made of stopping times) and to trade for each time t £ 0 the quantity 


Ct.T — ~ 


[l + p{T-t)\qsDt + Xt 
2 + p(T - t) 

( 1 ) • • • ) 1 ) ■ + 


TOi 

+ - X 

2p 


piT-t) 

2 + p{T- t) 


X [C((r - t)H) + vp{T - t) w((r - t)H)] \ . s6, 


(36) 


so that (|23|) holds in if s = 1. Here, 6t 
calculate D by using the following formula 




is the vector of intensity imbalances and we 


= [G(t-r)-G(oo)]. 

T<t 


In order to tune the leveraging of the strategy and its discreteness on the market, we introduce a scaling 
factor s £ [0,1] that multiplies 5t and Dt. By doing so, we multiply by s the deviation of the whole strategy 
from the standard Obizhaeva and Wang [23] liquidation scheme. The latter is static since it assumes that 
the observed price process is always a martingale. The limit s = 0 thus corresponds to the static strategy, 
whereas s = 1 is the optimal strategy given by Theorem l2.2l which may be very aggressive in standard market 
conditions. In fact, using the optimal strategy with s = 1 may lead to buy and sell repeatedly quantities 
that exceed the size of the first queues, which is not realistic. 


5.2 Methodology 

To backtest the strategy in practice, we choose to update our position when we observe midprice moves. Let 
us define 

0 = {0 £ (Arw, T), 0 — t{0) > Ladj}, 

where the 0’s correspond to the times of price jumps due to cancellations and passive limit orders, t{6) is 
the time of the last price jump due to a trade before 0, Arw is the regression window defined in Section [3.31 
and Ladj is the market adjustment lag. The position of the strategy at time t £ [0,T] is given by 

ege 

At time T, we close the position with the transaction 

AX^ = -X^. 

The time horizon is still T = 2 hours, where t = 0 corresponds to 11 a.m. and t = T to 1 p.m. We choose to 
apply the strategy on [Arwj T] instead of [0, T], so that the values of 6t and Dt for t > Arw can be accurately 
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computed. Moreover, for each time r G (ArW)T') where the price jumps because of a transaction, we do 
not trade on the time interval [r, r + iadj]- As a matter of fact, the market adjustment lag Ladj corresponds 
approximately to the time needed for the bid-ask to close after a trade that empties the best bid or the 
best ask. It would be meaningless to trade at the midprice (or even at the midprice ±1 half-tick) before the 
bid-ask is closed, and we would artificially boost the performance of the strategy if we allowed it. However, 
this constraint is not needed for simulated data, for which we set 0 = {0 G (Arwj’T)}. 

We assume that the scaling s is small enough for the effective impact of the strategy on the market price to 
be negligible. Although approximative, this assumption allows us to backtest the strategy assuming that we 
can trade at the observed price. 

In the sequel of this section, we apply the optimal strategy for the mono and multi-exponential Hawkes decay 
kernels and for several stocks. We summarize the results in one table and a few graphs for each stock. We 
note Yi is the profit made by the strategy on day i G {1, • ‘ ‘ jIT-}, Yn — ^ empirical mean and 

Sn = ~ ^n]'^ the empirical variance of daily profits. The values given in the table are 

• The annualized Sharpe ratio of the strategy 

Sharpe = \/u x 

• The empirical positivity probability, skew and kurtosis of daily gains 

1 " iW" [y. _^13 

Proba. = — ^ Skew = — - ^ Kurto 

The choice of the scaling s has no impact on these results, since all the values above are invariant to the 
multiplication of the strategy by a positive constant. Thus, only the units of the graphs are changed by the 
scaling, and we fix s = 0.001. With this choice, the volumes of individual transactions never exceed 5% of 
the average volume of the best bid/ask queue, which makes our toy backtest with no impact reasonable. 

For each stock and each period, we also evaluate of the “Poisson strategy” that one obtains if trades are 
modeled with two independent compound Poisson processes, which is equivalent to imposing = k, Kj” = k 
and thus = 0. More precisely, we trade for t G 0 (the same time grid as for the Hawkes model) the 

quantity 

[1 + p{T - t)]qsDt + Xt 
2 + p{T-t) 

This strategy is entirely based on mean-reversion, and the trend-following part disappears. For the Hawkes 
and the Poisson strategies, we give in the tables the impact of a bid-ask cost of one half-tick on the results. 
This corresponds to a more realistic implementation of the strategy (which should trade at the best and not 
at the midpoint) and we see that this is sufficient to prevent Price Manipulation Strategies in most cases (the 
Sharpe ratio becomes close to zero or even negative). As a benchmark, we also present in Table [TUI and ITT] 
the results of these strategies on simulated data. These give an idea of the profits that the strategies could 
reach in theory. 

Our findings are the following. On simulated data, the profits made by the strategies are evident and still 
significant with a half-tick penalty. On real data, the Sharpe ratios remain positive for all the tests, which 
indicates that the model is not out of scope and captures some characteristics of the real market flow. 
However, these ratios are lower than for simulated data and may become negative when we take the bid- 
ask spread into account. Said differently, market participants who use mean-reverting and trend-following 
strategies already exploit most of the arbitrage opportunities described by our model, and the backtest of our 
optimal strategy in realistic market conditions does not yield significant gains. Somehow, this justifies the 
theoretical assumption to consider a market without PMS when dealing with both market impact and the 


Si 
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bid-ask spread. Now, let us compare the different strategies used in Tables \TI\ and [T51 The results are rather 
similar for the three strategies and none of them seem to outperform the others. Intuitively, this means that 
the main component of the strategy is the mean-reverting one (which is common to the Poisson and Hawkes 
strategies), while the trend-following one has a minor contribution. This is confirmed by the statistical facts 
in Table |6] and |9] where the directional branching ratio DBR is much lower than the proportion of transient 
impact Amono = I - ly. 


5.3 Simulated data 

Tables 1101 and HD present the results of the optimal strategy applied to simulated data. The simulation 
parameters are the same as in Section 14.21 (see Tables [5] and [3]), and both datasets are composed of 150 
independent two-hour windows. In Tables ITUl and [TTl the two first columns contain the results of the strategy 
computed with the real simulation parameters for the Hawkes model, and the third and fourth columns 
contain the results for estimated Hawkes parameters. In both cases, the resilience is the estimated mono¬ 
exponential curve, since the optimal strategy is known explicitly only in that case. 


Year 

Simu. 

-|-bid-ask 

Calib. 

-|-bid-ask 

Sharpe (Multi) 

6.759 

3.225 

6.764 

3.176 

Proba. (Multi) 

74.0% 

63.3% 

74.0% 

63.3% 

Skew (Multi) 

0.55 

0.23 

0.57 

0.24 

Kurtosis (Multi) 

4.19 

4.03 

4.22 

4.05 

Sharpe (Mono) 

— 

— 

6.308 

3.371 

Proba. (Mono) 

- 

- 

74.0% 

62.7% 

Skew (Mono) 

- 

- 

0.47 

0.20 

Kurto. (Mono) 

- 

- 

4.11 

3.97 

Sharpe (Poisson) 

- 

- 

6.630 

3.735 

Proba. (Poisson) 

- 

- 

73.3% 

64.0% 

Skew (Poisson) 

- 

- 

0.43 

0.18 

Kurto. (Poisson) 

- 

- 

3.88 

3.80 


Table 10: Results statistics of the optimal strategy applied on the data of Simulation 1 (simulation parameters 
of Table E]). 


Year 

Simu. 

-|-bid-ask 

Calib. 

-I-bid-ask 

Sharpe (Multi) 

33.268 

27.095 

32.302 

25.769 

Proba. (Multi) 

100 .0% 

100 .0% 

100 .0% 

99.3% 

Skew (Multi) 

0.50 

0.51 

0.52 

0.54 

Kurtosis (Multi) 

3.22 

3.35 

3.25 

3.40 

Sharpe (Mono) 

— 

— 

34.940 

28.605 

Proba. (Mono) 

- 

- 

100 .0% 

100 .0% 

Skew (Mono) 

— 

— 

0.45 

0.46 

Kurto. (Mono) 

- 

- 

3.19 

3.31 

Sharpe (Poisson) 

— 

— 

34.986 

28.681 

Proba. (Poisson) 

- 

- 

100 .0% 

100 .0% 

Skew (Poisson) 

- 

- 

0.44 

0.45 

Kurto. (Poisson) 

- 

- 

3.12 

3.25 


Table 11: Results statistics of the optimal strategy applied on the data of Simulation 2 (simulation parameters 
of Table [3|). 
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5.4 BNP Paribas 


Year 

IS 2012 

-|-bid-ask 

IS 2013 

-I-bid-ask 

OS 2013 

-fbid-ask 

Sharpe (Multi) 

1.382 

-0.675 

2.454 

0.725 

2.248 

0.418 

Proba. (Multi) 

65.9% 

56.5% 

61.3% 

47.1% 

58.1% 

48.2% 

Skew (Multi) 

-2.02 

-2.40 

3.65 

3.34 

4.48 

4.14 

Kurtosis (Multi) 

19.02 

19.94 

29.40 

27.71 

36.96 

34.65 

Sharpe (Mono) 

1.263 

-0.713 

2.536 

0.771 

2.430 

0.563 

Proba. (Mono) 

62.9% 

57.1% 

62.3% 

48.2% 

58.1% 

49.7% 

Skew (Mono) 

-1.89 

-2.30 

2.94 

2.61 

3.56 

3.21 

Kurto. (Mono) 

16.64 

17.68 

23.27 

21.90 

26.74 

24.87 

Sharpe (Poisson) 

1.056 

-0.849 

2.5888 

0.8077 

2.513 

0.630 

Proba. (Poisson) 

65.3% 

55.9% 

61.3% 

49.7% 

60.2% 

49.2% 

Skew (Poisson) 

-2.72 

-3.07 

3.09 

2.76 

3.94 

3.58 

Kurto. (Poisson) 

23.46 

24.68 

24.41 

22.82 

31.13 

28.86 


Table 12: Results statistics of the optimal strategy applied on BNP Paribas on the periods February- 
September 2012 and January-September 2013, every day between 11.30a.m. and 1p.m. The first two columns 
are In-Sample results, i.e. the data used to calibrate the model is the same as the evaluation data. The third 
column gives Out-of-Sample results, i.e. we calibrate the model on the 2012 data to apply the strategy on 
the 2013 data. 



0 50 100 150 



Days 




(a) Trading at the midprice 


(b) One half-tick penalty 


Figure 4: Cumulated gains of the strategy applied on BNP Paribas on the period February-September 2012, 
every day between 11.30a.m. and 1p.m. The (red) long-dashed line is the performance of the Poisson model, 
the (blue) dashed line is the mono-exponential Hawkes model, and the (green) dot-dashed line is the multi¬ 
exponential Hawkes model. Left: we allow the strategy to trade at the midprice. Right: we apply a posteriori 
a linear cost penalty of one half-tick to account for the bid-ask spread. 
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Histogram of gains 


Histogram of gains 



I-1-1-1 

-100 -50 0 50 



-150 -100 -50 0 50 


Daily gains of the strategy 


Daily gains of the strategy 


(a) Mono-exponential Hawkes 


(b) Multi-exponential Hawkes 


Figure 5: Histogram of the daily gains of the strategy applied on BNP Paribas on the period February- 
September 2012, between 11.30a.m. and 1p.m. Left: Mono-exponential Hawkes model. Right: Multi¬ 
exponential Hawkes model. 



Days 




(a) Trading at the midprice (b) One half-tick penalty 

Figure 6: Cumulated gains of the strategy applied on BNP Paribas on the period January-September 2013, 
every day between 11.30a.m. and 1p.m. The (red) long-dashed line is the performance of the Poisson model, 
the (blue) dashed line is the mono-exponential Hawkes model, and the (green) dot-dashed line is the multi¬ 
exponential Hawkes model. Left: we allow the strategy to trade at the midprice. Right: we apply a posteriori 
a linear cost penalty of one half-tick to account for the bid-ask spread. 
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Histogram of gains 


Histogram of gains 




1 1 1 1 1 

-50 0 50 100 150 

Daily gains of the strategy 

-100 

-50 0 50 100 150 

Daily gains of the strategy 

200 

(a) Mono-exponential Hawkes 


(b) Multi-exponential Hawkes 



Figure 7: Histogram of the daily gains of the strategy applied on BNP Paribas on the period January- 
September 2013, between 11.30a.m. and 1p.m. Left: Mono-exponential Hawkes model. Right: Multi¬ 
exponential Hawkes model. 
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5.5 Total 


Year 

IS 2012 

-|-bid-ask 

IS 2013 

-I-bid-ask 

OS 2013 

-fbid-ask 

Sharpe (Multi) 

0.067 

-0.763 

2.697 

1.016 

2.794 

1.224 

Proba. (Multi) 

57.8% 

44.3% 

66.0% 

51.8% 

65.4% 

51.8% 

Skew (Multi) 

-9.34 

-9.62 

6.38 

6.37 

5.94 

5.97 

Kurtosis (Multi) 

114.76 

117.75 

62.86 

65.85 

53.84 

57.93 

Sharpe (Mono) 

0.126 

-0.770 

2.795 

1.191 

2.760 

1.099 

Proba. (Mono) 

59.4% 

44.8% 

66.0% 

52.4% 

65.4% 

52.4% 

Skew (Mono) 

-9.52 

-9.82 

6.01 

6.02 

6.18 

6.18 

Kurto. (Mono) 

118.29 

121.77 

55.54 

59.30 

59.20 

62.65 

Sharpe (Poisson) 

0.001 

-0.810 

2.807 

1.259 

2.790 

1.224 

Proba. (Poisson) 

57.8% 

43.8% 

65.4% 

50.8% 

65.4% 

50.8% 

Skew (Poisson) 

-9.33 

-9.59 

5.96 

6.00 

6.04 

6.08 

Kurto. (Poisson) 

114.39 

116.97 

53.37 

57.35 

54.90 

58.87 


Table 13: Results statistics of the optimal strategy applied on Total on the period January-September 2012- 
2013, every day between 11.30a.m. and 1p.m. The first two columns are In-Sample results, i.e. the data used 
to calibrate the model is the same as the evaluation data. The third column gives Out-of-Sample results, i.e. 
we calibrate the model on the 2012 data to apply the strategy on the 2013 data. 
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6 Conclusion 


In this paper we extend the theoretical model of [T] by allowing more general forms for the propagator and 
the Hawkes kernel. Moreover, we derive the conditions that exclude Price Manipulation Strategies in the 
sense of Huberman and Stanzl |20| in the case where both the propagator and the Hawkes part have a multi¬ 
exponential decay. This allows us to deduce some interesting links between the propagator and the Hawkes 
kernel for general completely monotone kernels. Besides, when the price propagator is mono-exponential and 
the Hawkes kernel is multi-exponential, we can still obtain the optimal strategy as a closed formula. This 
has some practical interest since the propagator seems to be well approximated by an exponential, while the 
Hawkes decay kernel clearly includes several characteristic time scales. 

We also introduce a calibration protocol for the model, that we apply to tick-by-tick data from French 
stocks. The results show that the model explains a significant part of the variance of prices. The long-range 
propagator is a smoothly decaying curve, but the short-range part is increasing during a few seconds (which we 
think corresponds to the time that the bid-ask needs to close after a large trade). Concerning the estimation 
of the Hawkes process modeling the flow of trades, we obtain excitation parameters that significantly differ 
from zero, which shows in particular that the flow is not Poissonian. Also, we find that the main driver of 
the excitation between trades is volumes rather than price moves. The martingale conditions that prevent 
PMS are violated in practice, in particular the directional branching ratio is smaller than the proportion of 
transient price impact. Therefore, in our dataset, the price has a notable mean-reverting tendency. 

A series of backtests shows that the optimal strategy used for round trips is profitable on average, therefore 
the model does offer a relevant prediction for midprice moves. However, a level of transaction costs compatible 
with the width of the bid-ask spread makes the profits close to zero. This confirms the natural idea that the 
absence of Price Manipulation Strategies at this frequency stems from both market impact and bid-ask costs. 

We eventually draw some applications and perspectives on our study. A first straightforward application is to 
use the calibrated model for optimal execution, by using the block trades (|36l) on a given (possibly random) 
time grid 0. Contrary to most existing models, this strategy takes the flow of trades into account. Another 
possible use of this model is to detect the instants when it is interesting to trade. In fact, equation (j46p gives 
the (theoretical) instantaneous cost of non-trading. One may decide to trade for example only if this cost 
is above some threshold, or optimize the trade-off between this cost and transaction costs. Such strategies 
could be interesting in practice, but need to be thoroughly investigated on market data. Let us now consider 
some possible extensions of our work. First, it would be interesting to handle a calibration of the model on 
an entire day instead of a two-hour window. This is certainly difficult due to intra-day variations of trading 
activity between the open and the close. Second, it would be nice to incorporate in our model transaction 
costs such as the bid-ask spread. A less ambitious goal would be at least to modify our optimal execution 
strategy to reduce transaction costs in a clever way, maybe by using equation (1461) as mentioned above. 
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A Estimation of the propagator using Newton-Raphson’s algorithm 

As explained in Section 13.31 we resort to Newton-Raphson’s algorithm to minimize the quadratic error 


£iG) = ^ [A - P0? 

which quantifies the distance between the observed midpoint price Pt and the predicted price 

Pt = Pt-Anw + AMt G(t — T). 

t — 

Let us assume that tt G R*, Z > 1, is a parameterization of G, i.e. G = G(7r) is determined by tt, and so is 
the error £{G) = £{t^)- For a starting point tto, the principle of the algorithm is to approximate G by the 
sequence G(7r„) such that 


VneN, TTn+i = TTn - [V^f(7r„)] \vf(7r„) 

where Vf(7r) G R* is the gradient of the error £ and (tt) G is its Hessian matrix, w.r.t. the parameter 
TT. The convergence of the method is only guaranteed if the starting point ttq is “good enough”, and if V^£(7r„) 
is positive definite for all n G N. 

To apply this method, one needs to compute the gradient V£{tt) and the Hessian matrix V'^£{tt) of the error 
£ for each parameterization tt of G. One has 

V£{tt) = 2 ^ [Aw - Pe] X VAW, 

= 2 ^ I [Aw - Pe] X vWeW + vAW- (vAw)^l • 

^RW 

The problem boils down to computing VA(7'‘) and V’^Peirr), which can themselves be expressed as 

VA W = VG{t - t), 

t—ARw<’7‘<i 

vWeW= Y. AMrV^G{t-T), 

t — 

where we drop the dependency of G in tt for clarity. Therefore, only the gradient VG and the Hessian V^G 
of the estimated propagator G need to be specifically derived for each parameterization, which is the object 
of the sequel. 

An important particular case is when G is linear w.r.t. tt. In that case, V^G = 0, thus V'^Peirr) = 0 and 

v2fw = 2 ^ vAW-(vAw)^ 

is positive definite for any tt. Also, in that case, VG does not depend on the current values of the parameter 
TT, and 

TTl = TTo - [V^fW)] ^V^W) 

is the minimizer of the error £{'k) for any ttq. Therefore, when the propagator is parameterized linearly, the 
starting point of the algorithm has no importance and one step is enough to find the optimum. 
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A.l Unconstrained propagator 

We consider the unconstrained propagator 


Git) — 5;l[t,,ARw[(^) + 


i=0 


[tj+i - t)gi + (t - ti)gi+i 
ti+\ ti 




with I > 2, 0 = to < ti < ■ ■ ■ < ti fixed discretization times, = 1 and tt = (gi, ■ ■ ■ ,gi) G [0, +oo)* the 
/-dimensional parameter to estimate. The dependence of G w.r.t. tt is linear, and we only need to compute 
the gradient: 


9G(t) _ ti+i -t ^ “ ^*-1 -n u\ r 1 ^ ; i 

- 7 —j~Mti,U+ili^) + 7“37 for 1 < z < / - 1, 

dG{t) _ (f\ \ 1 (A 

- 1 [P.Arw[W + 


A.2 Multi-exponential curve 

In this section we consider the multi-exponential resilience curve 

p 


R(t) = u + exp(-pit), 




and the propagator 


G(f) = 


l + (E(Ladj)-l) 


^adj J 




determined by R for Tadj > 0 fixed a priori. The dependence of G is linear w.r.t. the parameters if and only 
if the Pi’s are fixed. 


A.2.1 Unit Multi-exponential curve 

The “unit” multi-exponential resilience curve is the case where v = \ — X]r=i imposed. This yields 

p 

Rit) = 1 - y^Ai(l - exp(-pit)). 


and the parameter tt = (Ai,--- ,Ap,pi,--- , Pp) is 2p-dimensional. One has for i,j G ;P}) 


dRjt) 

dXi 


{1 - exp(-pjt)}, 


d^Rit) 

dpi 

d^Rjt) 

dXidXj 


= Xi exp(-pjt). 


= 0 , 


dkjt) _ 

dpi 

d^Rjt) 

dptdXi 


-t Aiexp(-pjt), 
= -t exp(-pjf). 


d’^Rjt) _ d’^Rjt) 

^p^^pj ’ dXidpj 


if i ^ j. 
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A.2.2 General Multi-exponential curve 


If we relax the condition v = 1 — that R{0) can be greater than unity, we obtain 

p 

R{t) = 77 + exp(-pit), 

with F > 0, Ai > 0. The parameter tt = (F, Ai, • • • , Ap, pi, - ■ ■ , Pp) is then {2p + l)-dimensional. The gradient 
and Hessian are given by 

dm _ , 

dv '' 


d^m _ d^m _ d^m _ 

’ dPdK ’ dpdp^ ’ 


dR{t) 

dXi 


exp{-pit), 


dm 

dpi 


-t Xi exp(-pit), 


d^m 

dpi 


R Aiexp(-pit), 


d'^m 

dXidXj 


d^m 

dpidXi 


t exp(-pjt), 


d^m _ ^ d^m 

dpidpj ’ dXidpj 


if i ^ j. 


B Maximum Likelihood Estimation for the Hawkes intensity 


The estimation of the Hawkes parameters, as presented in Section 13.41 resorts to Maximum Likelihood 
Estimation. The use of the MLE for Hawkes processes is well known, see for instance Ozaki and has 
been recently considered by Da Eonseca and Zaatour |12| in a similar financial framework. In this section, 
we give the formula of the log-likelihood for Hawkes processes, and we derive its gradient and Hessian matrix 
which are necessary to use Newton-Raphson’s algorithm. 

We define the jump processes J+ = Eo<T<t l{AVt>o} and = Eo<r<t l{AArt<o}, i-e- (resp. J") 
makes a unit jump when N'^ (resp. N~) jumps. Say that we observe the realization of the process on the 
time interval [0, T], and that we want to maximize its log-likelihood on [to, T], with to G [0, T). Conditionally 
to (K^)te[o.T], the log-likelihood of a trajectory {Jt)t&[to,T] on the time interval [to,T] is (see [13], Section 
HI Proposition 7.2) 

ln£(J=t|«:=t) = f \n{Kf_)dJt - [ nf dt + T. (37) 

Jto Jto 

Moreover, conditionally to (k)^, K)~)te[o.T], the global log-likelihood of the model is 

ln/:(J|K) =ln/:(J+|K+)-Hln/:(J-|/c-). (38) 

We now compute ln£(J+|K+). Since we do not know the history of the process before time t = 0, it is 
impossible to compute exactly using equation ([Sj) since it requires to know all the jumps. However, a 
reasonable approximation is to choose to G (0,2^) such that 

Vu > to, K{u) < 1, 


which yields 

i4 ~ Koo + X! [l{AVt>o}<Ps(AAt/mi) -H l{AArt<o}<Pc(-AAt/TOi)] (39) 

0<T<i 

for t S [tojT]. Let us assume in the sequel of this section that to is such that (13^ can be considered as an 
equality. 
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We define tq = 0 and r^, i > 1 the ordered combined jump times of -/V+ and N on [0,T], and x(t) = 
max{i > 0,Ti < i] for t G [0,T]. We also define for z > 1 

= ips{AN+Jmi)k:l + ipciAN~Jmi)k~, 


where k^ = 1 if is a jump time of iV+, k^ = 0 otherwise, and k^ is defined similarly with N . One has 
for t G [to, T] 

xW 

K+ = /Coo + ^ 6 »+iG(t - Tj). 

J=i 

Distinguishing the jumps before and after to, we get 



x(to) 

Koo(T-to) + ^ 0+[^(T-r,)-^(to-D)] + 


i=i 


x(T) 

i=x(*o)+i 


(40) 


where ^ is the antiderivative of K. Let us turn to the other term of the log-likelihood. We set Af = 0 and 
for z > 2 

i-l 

j=l 


and we have 



X{T) 

E A:+ln(/Coo+ ^+). 

i=x(to) + l 


(41) 


We have the explicit expression of the log-likelihood ln£(J+|/c+) from (1551) . (1571) . (HDl) and ((IT]) . Thus, it can 
be evaluated on a discrete set of points, for instance to estimate one or several parameters with a grid search. 
Now, to maximize the likelihood using Newton-Raphson’s algorithm, one must also determine the gradient 
and Hessian matrix of ln£( J+j/c’*'). 


For given parameterizations of ips,<fc and K, we note tt an arbitrary parameter, and we have 


i91n£(J+|/c+) 

^/Coo 

ain£(J+|/c+) 

dn 


kt 


x(T) 

E 

*=x(to) + l * 

X{T) 7 -(-a A + 

■ O 

*=x(to) + l * 

x(to) 

- E - =^(^0 - ^o)]} 

1=1 


X(T) 

l=x(*o) + l 


which yields the gradient of the log-likelihood. For the Hessian matrix, let us note tt, tt' two parameters 
(distinct or not) of (ps,<Pc or K. We have 


\nC{J+\K+) 

dulo 

ln£(J+|/c+) 

dirdn' 


X{T) 

- E 

*=x(to) + l 
X(T) 

E 

i=x(to) + l 


oo+A+F’ 

aL-A+ 


+ At 


a^ ln£(J+|/c''‘) 

dKoodir 

d^Af d^,A+ \ 

[«oo+A+F J 


x(to) 

- E - A - Kito - d)]} 


1=1 


X(T) 

^ I 

*=x(io) + l 


ktdAt 

Koo + A'kY 


X{T) 

- E diAotmT-A-m)]}- 

l=x(to) + l 
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As soon as is known and are twice differentiable w.r.t. the parameterization, it is straightfor¬ 

ward to deduce the analytical expressions of the gradient and Hessian matrix of the log-likelihood from the 
preceding equations. 


C Optimal execution with a multi-exponential Hawkes kernel 


C.l Proof of Theorem 12.11 


First, let us remark that E WtdXt — WtXt = 0, and we can assume without loss of generality that 
(7 = 0. We decompose the price process as follows. We introduce dS^ = ^dNt, dD^’^ = —piD^’^dt+ ^dNt, 
dSf = ^dXt and dZff= -p.of'^dt + ^dXt, with = So, = D*, = 0. We have 


Pt = Pf + Pf, with Pf = Pf Pt^ = DfP 

Then, we can write the cost m as 

C{X)= [ P^dX,,-P^XT + C{X), 

J[0,T) 

where C(X) = C j,, P,^dXu +-^ — Pt ^ Xt- We note that C{X) is a deterministic 

function of X and is precisely the cost function considered in 0. Besides, it satisfies C{cX) = c^C{X) for 
c £ M. By the same argument as in the proof of Theorem 2.1 in [T], we get that there is no PMS if, and only 
if Pt is a (Pt)-martingale when Xt = 0 for any t. 

We now consider that X = 0 and write the martingale condition for P under the Hawkes model ([9|), (TTOll 
and (ITTIl . We have 


dPt = dSt -I- dPt -I- a dWt 


1 

9 


p 

dNt — ^ PiDt dt + a dWt 

i=l 


1 

q 


p 

dNt + (7dWt+ dtJ2^l 

i=l 


where 


q 


Sl = 




and Nt = Nt — mi Sudu is a martingale. Then, (Pt) is a martingale if and only if almost surely and 
dt-almost everywhere, X]r=i ~ We have 


dAl 


-piA\ dt -I- —Wi dit - dNt, 

q q 


with 



(pc)(dN+/mi) - ((/Js 


‘(5c)(dX„ /mi)] . 


(42) 


In particular, dA) = —piAldt between two consecutive jumps r and r' of N. Therefore, we have J2i=i A) = 
A\.e~Pd*-A = 0 for t £ and therefore A). = 0 for all i (the equality for t = r -|- k(T' — T)/p,k £ 

{0, ... ,p — 1} gives a Vandermonde system). Thus, we necessarily have = 0 for t > 0 for any i. Then, 
dA) = 0 gives 


[(tps - (pc)(dX/‘/mi) - ((/?s - Pc)idNt /mi)] 


^ [d7V+ - dN-] 


37 






for all f > 0 and all * G {!,•■■ tP}- Thus, ips — ipc niust be linear on the support of the law /i of the jumps 
of N^, and besides, we must have Vi, (tg — Lc)wi = XiPi- This precisely gives m- Conversely, it is clear 
that m ensures that P is a martingale by the same calculations. 


C.2 Proof of Theorem 12.21 


As in Section (IC.lll . we assume without loss of generality that cr = 0. We first introduce some notations to 
e main results on the c 
From (IH), (fT0)l and (l20ll . we have 


present the main results on the optimal execution. We define SI = and EJ = 




AS] = -/3j 51 dt + Wi dit , dE* = -fSi (E* - 2koo/p) dt + Wi d/*, (43) 

for all i G {1, • • • ,p}, where It = [((/3s + Pc){dN:^/mi) + + Pc)[dN~/mi)] and It is defined by dH]). 

We now proceed exactly as in [T], Appendix B, and only give here the main lines and use similar notations. We 
assume without loss of generality g = 1. For t G [0, T], a;, d, z G K and d, E G W, we denote by C{t, x, d, z, <5, E) 
the minimal cost to liquidate Xt = x over the time interval [t,r] when Dt = d, St = z, St = S and Et = E. 
We look for a function that has the following form 

C{t,x,d,z,5,T.) = a{T — t){d — {1 — i')x)^ + -{z — ux)^ + {d — {1 — iz)x){z — i^x) — -— 

p p p 

+ (d - (1 - iy)x) ^ -t) 5^ + ^ ^ Cjj (T - t) SiSj 

i—1 i—1i—1 

P 

+ ^e,(r-f) E, + g(r-t), (44) 

i=l 


with a, bi, Cij, Ci, g : R+ —>■ K continuously differentiable functions. We have the limit condition C(T, x, d, z, d, E) 
— (d + z)a: + x'^/2 = i(d + z — x)'^ — (d + z)^/2, which is the cost of a trade of signed volume —x. We thus 
have 

a(0) = &.(0) = c..,(0) = e(0) = g(0) = 0. 

For an arbitrary strategy X, we define nt(X) = J* PudXu + + C{t, Xt, Dt, St5t,X^t)- This 

is the cost of the strategy which is equal to X up to time t and is then optimal. Therefore, (n/(Ai),t G [0,T]) 
has to be a submartingale and is a martingale if, and only if, X is optimal. We define 


dAf = 


Z{t, Xt, Dt,St, 5t,x:t) + dtC - pDtddC - ^ ^ /3,(Ej - 2n^/p)d^fi 


i=l 


i=l 


dt, (45) 


where the derivatives of C are taken in (t, Xt, Dt, StSt, E/) and Z(t, x, d, z, 5, E) := 


/ 2^ P \ V V 

- y^[Ei + (5j] E[C(t,a:,d + (1 - v)V, z + iyV,S + ips-d - )w, E + (ps+d — )w) - C{t,x, d, z, S, E)] 

\ 2 TOi mi ■' 


V 


V 


+ ( ^ “ '^*1 jE[C(t,a:,d- (1 - iz)V,z- vV,5 - ips-c{.—)w,Ys + ips+c{ — )w) - C{t,x,d,z,5,Y?)\, 


i=l 


mi 


with V ^ p, ips-c = Ps — Pc and ps+c = Ps + Pc- The process is continuous and such that nt(Ai) — 
is a martingale. Given the quadratic nature of the problem, we search a process A^ of the form 


dAf = 


1 - 1 / 


-dt X 


j{T-t)iDt-{l-iz)Xt) - Dt + j]fc,(T-t)dJ 


(46) 
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We now introduce the variables y = d — {1 — v)x and ^ = z — vx and work with {y,d,^,S,Y]) instead of 
{x, d, z, S, E). From (IHl) and the definition of Z, we have 

dtC{t,x,d,z,S,Z,) = -a y'^ - y^k k - 6,dj - - g, 

— pd ddC(t,x,d, z,6,Il) = — ( 2pa -\—-— ) dy + —-— d^ — pd^^ biSi, 

\ \ — V J 1 — V 


- /3i6i dsiC{t,x,d,z,S,T,) = -Pibi Siy - fdA 


‘2Ci^i6i + Ci,jSj 

j¥^i 

- - 2 koo / p ) d^^C{t,x,d,z,S,E) = -PiCi + 2PiK^ei/p, 

p \ p 


Z{t,x,d, z,6,T,) = mi x 


2(1 — i')a + iz + 


1 - 


fc=i 


p p 


+EE 

i=i j=i 


(1 - u)mibi + 2^ Ci, 


kO-k 


k=l 


5i5j 


+ E 


TO2 X 


{l-vfa + v{l-v/2)-- 


2=1 


P P 


Y^akbk] yJ26, - 


2=1 


+ (i-oE Ukbk + a EE Ck,iWkWi ^'^{ak-\-2wkic)ek\ E^, 


A:=l 


fe=lz=i 


fe=i 


with a = E[l/ X ((^s — ipc){V/mi)], a = E[(i^s — </Jc)^(l^/'mi)]. We now identify each term of equations (H51) 
and (gni). 

(Eq. dy)-. - (2pa + ^) = -fy, (Eq. y^): -d = 

These two equations are the same as in [T] and give 

1 iz 


ji'^) — t;- a{u) = 


2 + pu 


1 — iz \2 + pu 2 


(47) 


(Eq. Sty): -h - PA + Z)fe=i + mi x 2{1 - ij)a + +-^ = -^jh. 

(Eq. 5dy. - Ph - ^ = 

which yields A:i(u) = ^ bi{u) + Plugging this in (Eq. we have 6* = + X)fe=i a/c^fc- 

+ mi 2(1 — iy)a + v + , and since j/{l — v) = a + !^/[2(l — v)], we have bi{u) = 

-PA{u) + YX=i akbkiu) - 2 ^bi{u) + X We rewrite it as 


b{u) = 


-H - 


2 + pu ^ 


b{u) 




1-iz 2- 


pu 


(48) 


where Ip G is the identity matrix and H G is given by (I^Tl) . To solve equation (|48l) . we search a 

solution of the form b{u) = 2 +pu ^ [exp(—u77) . 6(u)] for u > 0. This yields 

1 r 1 iTi 7/ M 1 + i^pu j- 

X [exp(-ull) . b(u)] = - - X —- X (I,-- - ,1) , 


2 + pu 


1 — ly 2 + pu 


thus 


mi 


b(u) = j- X (1 + vpu) X [exp(u77) . (1, • • • , 1) ]• 
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From the definition (1^ . we have exp{—uH). + I'ps) x exp(siJ)ds] = uC,{uH)-\-vpu^uj{uH) for u > 0. 

Since 6(0) = 26(0) = 0, we obtain 

6(u) = X — X [{C{uH) + vpu u}{uH)} . (1, • • • , l)"^]- (49) 

L ly Zd x" pu 

Equation (Eq: 5id) then gives the vector function k{u) 

Ku) = ^ X (4 + n X \C{uH) + vpu uj{uH)\ I .(1,---,1)^. (50) 

2p y i + pu J 

Thus, the functions j and k involved in ()46l) are explicit, which guarantees that the optimal strategy is 
obtained as a closed formula. 

The remaining functions Cij, Ci and g do not play any role to determine the optimal strategy. By identifying 
the terms in Si6j, and the constant term, we check that they solve a system of linear ODEs. They are 
thus uniquely determined and well-defined on R+, and the cost function C is well-defined. Thes ODEs are 
also important to run the verification argument, i.e. to check that C is indeed the optimal cost function and 
that the strategy X* described below is the optimal one. 

We now determine the strategy X* such that n(X*) is a martingale, or equivalently such that is 

constant. Equations (HS)) and (1471) yield 

p 1 ^ 

+ Ydk^T-t) SI 

P 1 2 

[l + p{T-t)]Dt - [2 + piT-t)] Y,h{T-t)Sl . 

2 = 1 


d.lf = 


-dt X 


1 - J/ 

p/{l-v) 


A - (1 - v)Xt 
2 + p(T- t) 


- Dt 


dt X 


{l-v)Xt + 


[2 + p{T-t)f 
Thus, is constant on (0, T) if, and only if 


p 

a.s. , dt -a.e. on (0, T) , (1 - u)X; = - [1 + p{T - t)] A + [2 -F p{T - t)] - t) 6). (51) 

2=1 


This equation characterizes the optimal strategy. In particular, we obtain its initial jump AXq at time t = 0 


(1 - ,,)AX* 


[1 -I- pT]qDo + xq 
2 + pT 




[ip + X [C{TH) + vpT uj{TH)]^ . So 


where 6o = (^m ' ‘ ‘ i ^ 
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